Skip to content

Conversation

@DrewKimball
Copy link
Collaborator

builtins: decrease wait time for await_statement_hints_cache

Previously, crdb_internal.await_statement_hints_cache waited until
the hint cache frontier timestamp reached now() plus the closed
timestamp target duration and refresh interval. The frontier timestamp
indicates that we've seen all events up to that timestamp, so it's OK
to just wait until now() instead.

Epic: None

Release note: None

explain: display number of hints for EXPLAIN ANALYZE

This commit adds a top-level field to the EXPLAIN ANALYZE output
showing the number of hints applied to the statement if nonzero.

Informs #121502

Release note (sql change): EXPLAIN ANALYZE will now display the number
of hints from system.statement_hints applied to the executed statement.

ui: show whether statement hints were applied in DB console

This commit adds a field to the "plan details" shown for statements in
the "statement activity" page of the DB console. The new field indicates
whether any statement hints from system.statement_hints were applied
to the statement.

Informs #121502

Release note (sql change): Plan details in the "statement activity" page
of the DB console now show whether any hints from system.statement_hints
werer applied to the statement execution.

@DrewKimball DrewKimball requested review from a team and michae2 November 11, 2025 03:30
@DrewKimball DrewKimball requested review from a team as code owners November 11, 2025 03:30
@DrewKimball DrewKimball requested review from angles-n-daemons and removed request for a team November 11, 2025 03:30
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@yuzefovich reviewed 7 of 7 files at r1, 3 of 3 files at r2, 13 of 13 files at r3, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @angles-n-daemons and @michae2)


-- commits line 22 at r2:
Should we do the same for regular EXPLAIN? I.e. the hints should be applied regardless of whether we execute the stmt. Given that we only show the count when non-zero it seems reasonable to show this by default too (i.e. not only in VERBOSE mode).


pkg/sql/opt/exec/execbuilder/testdata/explain_analyze line 291 at r2 (raw file):


query T
EXPLAIN ANALYZE SELECT k FROM kv WHERE k >= 100

nit: if we're only interested in one line, perhaps apply LIKE '%hints count%' filter.

Copy link
Collaborator

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool!

@michae2 reviewed 7 of 7 files at r1, 3 of 3 files at r2, 13 of 13 files at r3, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @angles-n-daemons, @DrewKimball, and @yuzefovich)


-- commits line 22 at r2:

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Should we do the same for regular EXPLAIN? I.e. the hints should be applied regardless of whether we execute the stmt. Given that we only show the count when non-zero it seems reasonable to show this by default too (i.e. not only in VERBOSE mode).

Good point, this does seem worth showing in EXPLAIN as well.


pkg/sql/hints/hint_cache.go line 385 at r1 (raw file):

	// Wait in intervals of at least 100 milliseconds to avoid busy-waiting.
	const minWait = time.Millisecond * 100
	waitUntil := c.clock.Now().WallTime

When I used only Now() I was getting failures in:

./dev testlogic base --config=local --files=statement_hint_builtins --ignore-cache --stress

(Usually whenever kv-batch-size = 1.)

But if this is passing for you, then I'm happy.


pkg/sql/hints/hint_cache.go line 393 at r1 (raw file):

		case <-ctx.Done():
			return
		case <-time.After(max(time.Duration(frontier-waitUntil), minWait)):

I might be misunderstanding, but if frontier <= waitUntil, then won't frontier - waitUntil always be <= 0? Meaning this is always minWait?

Copy link
Collaborator Author

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @angles-n-daemons, @michae2, and @yuzefovich)


-- commits line 22 at r2:

Previously, michae2 (Michael Erickson) wrote…

Good point, this does seem worth showing in EXPLAIN as well.

Done.


pkg/sql/hints/hint_cache.go line 385 at r1 (raw file):

Previously, michae2 (Michael Erickson) wrote…

When I used only Now() I was getting failures in:

./dev testlogic base --config=local --files=statement_hint_builtins --ignore-cache --stress

(Usually whenever kv-batch-size = 1.)

But if this is passing for you, then I'm happy.

I became concerned there was a bug somewhere, so spent some time digging. It turns out that when UseRangeTombstonesForPointDeletes is set for the logic test, we replace SQL deletes with MVCC range tombstones (which are only normally placed for specific ops like TRUNCATE that shouldn't happen for system tables). Rangefeeds don't currently know how to handle that, so the delete is skipped. I think the increased wait time probably "fixes" it because the tombstone is compacted away.

Anyway, with UseRangeTombstonesForPointDeletes disabled for that test, it passes even with no fudge factor in the Await loop.


pkg/sql/hints/hint_cache.go line 393 at r1 (raw file):

Previously, michae2 (Michael Erickson) wrote…

I might be misunderstanding, but if frontier <= waitUntil, then won't frontier - waitUntil always be <= 0? Meaning this is always minWait?

Oops, dumb mistake. Fixed it.


pkg/sql/opt/exec/execbuilder/testdata/explain_analyze line 291 at r2 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: if we're only interested in one line, perhaps apply LIKE '%hints count%' filter.

EXPLAIN ANALYZE can only be a top-level statement. And we can't do it for EXPLAIN either, because the hints won't apply.

@github-actions
Copy link

Potential Bug(s) Detected

The three-stage Claude Code analysis has identified potential bug(s) in this PR that may warrant investigation.

Next Steps:
Please review the detailed findings in the workflow run.

Note: When viewing the workflow output, scroll to the bottom to find the Final Analysis Summary.

After you review the findings, please tag the issue as follows:

  • If the detected issue is real or was helpful in any way, please tag the issue with O-AI-Review-Real-Issue-Found
  • If the detected issue was not helpful in any way, please tag the issue with O-AI-Review-Not-Helpful

@github-actions github-actions bot added the o-AI-Review-Potential-Issue-Detected AI reviewer found potential issue. Never assign manually—auto-applied by GH action only. label Nov 13, 2025
Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yuzefovich reviewed 2 of 18 files at r4, 5 of 5 files at r5, 13 of 13 files at r6, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @angles-n-daemons and @michae2)


-- commits line 22 at r2:

Previously, DrewKimball (Drew Kimball) wrote…

Done.

We should adjust the release note to mention EXPLAIN.


pkg/sql/opt/exec/execbuilder/testdata/explain_analyze line 291 at r2 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

EXPLAIN ANALYZE can only be a top-level statement. And we can't do it for EXPLAIN either, because the hints won't apply.

Right, nvm.


pkg/sql/hints/hint_cache.go line 223 at r4 (raw file):

	ctx context.Context, update rangefeedcache.Update[*bufferEvent],
) {
	if update.Type == rangefeedcache.CompleteUpdate || len(update.Events) == 0 {

Can you expand on the reason for this change?

If I'm reading the code right, the rangefeed buffer is flushed every time the frontier TS is bumped (and after the initial scan is complete), and all events are included in the incremental update. (The comment on Buffer says that if the capacity is exceeded when adding an event, then the error is returned. Which I guess we just ignore because we pass nil /* onError */.) In other words, every incremental update corresponds to the frontier TS bump, which is how the code previously behaved.


pkg/sql/opt/exec/execbuilder/testdata/explain line 2591 at r6 (raw file):


query T
EXPLAIN SELECT k FROM t_hints WHERE k >= 100

Looks like the hint count is missing here.

Previously, `crdb_internal.await_statement_hints_cache` waited until
the hint cache frontier timestamp reached `now()` plus the closed
timestamp target duration and refresh interval. The frontier timestamp
indicates that we've seen all events up to that timestamp, so it's OK
to just wait until `now()` instead. This commit also fixes a flake in
the `statement_hint_builtins` logic test due to a test flag
`UseRangeTombstonesForPointDeletes`.

Epic: None

Release note: None
This commit adds a top-level field to the output of `EXPLAIN` and
`EXPLAIN ANALYZE` showing the number of hints applied to the statement,
if nonzero.

Informs cockroachdb#121502

Release note (sql change): EXPLAIN and EXPLAIN ANALYZE will now display
the number of hints from `system.statement_hints` applied to the executed
statement.
This commit adds a field to the "plan details" shown for statements in
the "statement activity" page of the DB console. The new field indicates
whether any statement hints from `system.statement_hints` were applied
to the statement.

Informs cockroachdb#121502

Release note (sql change): Plan details in the "statement activity" page
of the DB console now show whether any hints from `system.statement_hints`
werer applied to the statement execution.
Copy link
Collaborator Author

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @angles-n-daemons, @michae2, and @yuzefovich)


-- commits line 22 at r2:

Previously, yuzefovich (Yahor Yuzefovich) wrote…

We should adjust the release note to mention EXPLAIN.

Done.


pkg/sql/hints/hint_cache.go line 223 at r4 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Can you expand on the reason for this change?

If I'm reading the code right, the rangefeed buffer is flushed every time the frontier TS is bumped (and after the initial scan is complete), and all events are included in the incremental update. (The comment on Buffer says that if the capacity is exceeded when adding an event, then the error is returned. Which I guess we just ignore because we pass nil /* onError */.) In other words, every incremental update corresponds to the frontier TS bump, which is how the code previously behaved.

Yes, I think you're right. I had convinced myself that we might be receiving multiple incremental updates at the same timestamp, but that was before I found the thing with range tombstones.


pkg/sql/opt/exec/execbuilder/testdata/explain line 2591 at r6 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Looks like the hint count is missing here.

Done. Just needed to regenerate the test.

Copy link
Collaborator

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Nice!

@michae2 reviewed 1 of 18 files at r4, 19 of 19 files at r7, 5 of 5 files at r8, 13 of 13 files at r9, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @angles-n-daemons, @DrewKimball, and @yuzefovich)


pkg/sql/hints/hint_cache.go line 385 at r1 (raw file):
Ahh that explains it! Nice digging!

which are only normally placed for specific ops like TRUNCATE that shouldn't happen for system tables

Interesting. I guess we will need to remember that TRUNCATE won't be picked up by the cache unless we follow it with another mutation like a DELETE FROM system.statement_hints.

Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yuzefovich reviewed 19 of 19 files at r7, 5 of 5 files at r8, 13 of 13 files at r9, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @angles-n-daemons and @DrewKimball)


-- commits line 41 at r9:
nit: s/werer/were/.


pkg/sql/hints/hint_cache.go line 389 at r7 (raw file):

	// Await is only used for testing, so we don't need to wake up immediately. We
	// can get away with polling the frontier time.
	for frontier := c.frontier.Load(); frontier <= waitUntil; frontier = c.frontier.Load() {

nit: not important (and perhaps impossible), but why should we wait when frontier == waitUntil?

Copy link
Collaborator Author

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @angles-n-daemons, @michae2, and @yuzefovich)


pkg/sql/hints/hint_cache.go line 385 at r1 (raw file):

Interesting. I guess we will need to remember that TRUNCATE won't be picked up by the cache unless we follow it with another mutation like a DELETE FROM system.statement_hints.

There might be other ways it could happen, but TRUNCATE at least isn't allowed for system tables. I guess that's because it drops and recreates the table?


pkg/sql/hints/hint_cache.go line 389 at r7 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: not important (and perhaps impossible), but why should we wait when frontier == waitUntil?

Might not matter in practice, but we're comparing wall time here, so even if it's equal, the logical frontier timestamp could technically still be earlier than now().

@DrewKimball
Copy link
Collaborator Author

TFYRs!

bors r+

@craig
Copy link
Contributor

craig bot commented Nov 17, 2025

@craig craig bot merged commit b8111a6 into cockroachdb:master Nov 17, 2025
33 of 34 checks passed
@DrewKimball DrewKimball deleted the hint-observe branch November 25, 2025 18:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

o-AI-Review-Potential-Issue-Detected AI reviewer found potential issue. Never assign manually—auto-applied by GH action only. v26.1.0-prerelease

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants